THE SIZE DISTRIBUTION OF OIL AND GAS FIELDS
1.   Introduction 

Statistical  decision  theory  is  the  mathematical  analysis  of  decision 
problems  in  which  uncertainty  is  a  key  element.  Use  of  it  can  give  the  oil 
and  gas  operator  ordered  insight  into  the  way  exploration  decisions  should 
be  made  in  order  to  achieve  preassigned  goals;  e.g.  maximize  the  net 
expected  present  value  of  an  exploratory  drilling  program.   It  does  this 
by  forcing  the  decision-maker  to  make  explicit  assumptions  and  judgements 
that  are  implicit  in  every  such  decision  problem.   In  other  words, 
statistical  decision  theory  is  a  vehicle  for  rendering  precise  the  key 
variables  and  relations  among  them  that  constitute  the  core  of  exploration 
problems.   It  is  important  to  recognize,  however,  that  people--not 
mathematical  models--make  decisions,  so  statistical  decision  theory  is 
only  an  aid  to  decision-making,  not  a  decision-making  device. 

Rather  than  emphasize  the  specifics  of  how  statistical  decision 
theory  comes  into  play  in  analysis  of  exploration  problems,  in  this  paper 
we  will  concentrate  on  illustrating  how  statistical  methodology  can  be 
used  to  build  a  probabilistic  cornerstone  of  mathematical  models  of 
some  important  exploration  decision  problems. 

As pointed out in [1], in any decision concerned with the strategy
and tactics of oil and gas exploration, a key variable is the size of
hydrocarbon deposits in barrels of oil or in MCF of gas. The size of pool
or field† discovered in a particular wildcat venture determines the degree
to which the venture is an economic success. Since the pool or field
size that will be discovered is almost always unknown before a prospect
is drilled, an important question is:

What functional form of distribution function should be
used to characterize the probability distribution of field
sizes in a petroleum province?

By "functional form" we mean a mathematical formula which defines a
family of distribution functions.

†For convenience we shall call a hydrocarbon deposit a "pool" or a
"field", even though the terms differ in usage in that a new "pool" can
be found within an already discovered "field". By convention, "size"
will always refer to size in barrels of oil or in MCF of gas. The word
"area" will denote areal extent.

Clearly  the  functional  form  used  to  characterize  the  size 
distribution  of  oil  and  gas  fields  is  a  vital  part  of  any  model  which 
you  as  decision  makers  might  use  to  analyze  exploration  decisions. 
Ideally,  we  would  like  this  form  to  be  flexible  enough  to  fit  a  wide 
variety  of  empirical  histograms  of  oil  and  gas  fields  in  differing 
areas  with  differing  definitions  of  reserves  by  varying  only  the  value 
of  the  parameters  of  the  form,  not  the  form  itself.  We  also  would  like 
it  to  be  analytically  tractable,  so  that  it  may  be  easily  used  in  the 
course  of  a  formal  analysis  of  exploration  decision  problems;  e.g.  by 
use of statistical decision models. The Lognormal functional form has
these properties.†

†A random variable is said to be "Lognormally distributed" if
the logarithm of the random variable is Normal or Gaussian; cf.
Appendix A.

In addition to possessing these properties the Lognormal
distribution has other desirable attributes:

1.  it may be shown to be in concordance with some
concepts of how mineral deposits are formed;

2.  stochastic models of the discovery process built on
reasonable assumptions about the process lead to
the Lognormal functional form;

3.  the class of Lognormal functional forms is
analytically tractable and flexible enough to
capture most reasonable oilmen's subjective
betting odds about random variables such as
reported field size.

A  detailed  discussion  of  these  points,  a  development  of  several 
methods  for  blending  subjective  probability  beliefs  of  experts  about 
field sizes that are not known with certainty with objective evidence,
and application of these methods to some typical exploration decision
problems--notably drilling decision problems--are given in [1] and [2].
Both  references  show  how  concepts  from  statistical  decision  theory  can 
be  used  to  analyze  such  problems. 

My  purpose  here  is  three-fold:   first,  to  show  that  the  Lognormal 
distribution  provides  a  reasonable  fit  to  empirical  histograms  of 
reported oil field sizes, where "size" is measured in barrels of
ultimate primary recoverable reserves; second, to illustrate how we
can use the Lognormal distribution to describe systematically one
important dimension of the discovery process--the build-up over time
of  such  histograms;  and  third,  to  demonstrate  with  a  hypothetical 
example  how  we  can  use  properties  of  the  Lognormal  distribution  as 
an  aid  in  assessing  how  the  probability  distribution  of  field  sizes 
remaining  to  be  discovered  in  a  basin  varies  with  time.   Knowing 
properties  of  this  last  mentioned  probability  distribution  is  extremely 
useful  in  assessing  the  effect  of  time  of  entry  on  the  expected 
profitability  of  a  major  exploration  program  in  a  given  basin. 


2.   Empirical  Histograms  of  Reported  Oil  Field  Sizes 
2.1  Data Sources

Oil and Gas Journal statistics† on reported field sizes are plotted
in Exhibits 1 through 9 to demonstrate how the Lognormal functional
form can be fitted to empirical histograms of reported oil field sizes
and to show how we may characterize the change over time of these
histograms using only a pair of numbers--the parameters of the
Lognormal distribution--for each point in time.

†The sources of these statistics are the Oil and Gas Journal Annual
Review Issues for 1952, 1954, 1956, 1960, and 1964. See Table I.

These statistics are not ideal for our purposes for several
reasons. First, thinking of reported field sizes as shown in the
Oil and Gas Journal for, say, Oklahoma in 1954 as a sample from some
true underlying size distribution of oil fields, the sample is truncated.
That is, of all fields discovered as of a given year, only those
"fields with production of at least 1000 barrels per calendar day"††
are included in the sample. Second, there is a great deal of reporting
bias present in these estimates of field sizes, especially in
estimates of the size of "younger" fields before substantial production
experience with them has accrued. Third, the meaning of the
definition of field size as "total ultimate primary reserves"†††
recoverable from the field changes as the technology of recovery changes.
Fourth, the usual definition of a field as consisting of one or more
pools vertically separated from one another but having similar areal
boundaries or outline is flexible and is not operationally precise in
every instance; e.g. the Davenport "field" in Oklahoma is listed as
one "field" in the Oil and Gas Journal, but reported in terms of six
subdivisions (fields) in the International Oil and Gas Development
Yearbook.‡

††cf. Oil and Gas Journal, Annual Review Issue, Vol. 58, No. 4
(January 25, 1960) p. 163. Major fields with an ultimate recovery estimate
of 100 million barrels or more are included regardless of daily production.

†††Ibid.

‡1962, Part II, Vol. XXXII, p. 137 ff.

However, we can use these statistics as an illustrative vehicle,
keeping in mind that any company undertaking a study of field size
distributions for the purposes mentioned earlier can improve the quality
of such data by confining their analysis to geological basins instead
of geographical areas such as states, and by using more refined sources
of information.‡‡

‡‡See [3] for example.

Table I lists the issues from which the data used here was taken,
as well as the definitions of reserves used in each year and the criteria
for including a field in the list.

2.2  Methodology

For both Oklahoma and South Louisiana we carried out these steps:

1.  For each of the years 1952, 1954, 1956, 1958, 1960, 1962,
and 1963 we calculated an estimate of ultimate primary
reserves of each field by adding "proved remaining reserves"
to "cumulative production to date", and then ordered
these estimates from smallest to largest.

2.  The ordered estimates were plotted as described in
subsection 2.3 and displayed in Graphs 1 through 9.

3.  A Lognormal distribution was fitted to each plot and
its parameters were estimated in two different ways
(Table III).

4.  Since the shape of a Lognormal distribution is completely
determined once its two parameters are specified, we
summarized the character of each of the nine graphs
by specifying the parameters of the Lognormal
distribution fitted to it (Graph 9).

5.  In Graph 9 estimates of the mean, median, and mode
of each frequency histogram of field sizes are displayed
so as to trace the manner in which they change
over time.

As  stated  earlier,  the  probability  distribution  of  field  sizes 
remaining  to  be  discovered  in  an  area  as  of  a  given  point  in  time  is 
of  critical  importance  in  assessing  the  expected  profitability  and  the 
"risk"  of  pursuing  an  exploration  program  in  that  area  at  that  point  in 
time.   The  effect  of  timing  on  the  expected  profitability  and  "riskiness" 
of  an  exploration  program  can  only  be  explicitly  assessed  if  one  has  a 
systematic way of showing how the odds of discovering economically
viable fields change with time. To this end we carried out steps 6 and 7.

6.  We used the parameter estimates shown in Table III
together with some hypothetical subjective probability
judgements about the true underlying size distribution
of oil fields in barrels to compute a probability
distribution of field sizes remaining to be discovered
in South Louisiana as of 1952 and as of 1964.

7.  We calculated the mean and variance of field sizes
remaining to be discovered in South Louisiana as of
1952 and as of 1964 using the results of step 6 as an
example of how the first two moments of the probability
distribution of field sizes remaining to be discovered
change with time.


TABLE I

SOURCES OF DATA USED IN EXHIBITS 1 THROUGH 9†

Year 1952.  Vol. 51, No. 38, p. 289.
    Definition of Reserves: "...estimates of proved remaining
    reserves of crude oil, condensate, and cycled products for the
    country's larger producing fields."
    Criteria for Inclusion: "Larger producing fields"

Year 1954.  Vol. 53, No. 39, p. 197.
    Definition of Reserves: same as 1952
    Criteria for Inclusion: same as 1952

Year 1956.  Vol. 55, No. 4, p. 159.
    Definition of Reserves: same as 1952
    Criteria for Inclusion: "Those major fields with an estimated
    ultimate recovery of 100 million barrels or more are...included
    here regardless of present daily production. Fields included all
    produce 1,000 barrels or more per calendar day."

Year 1958.  Vol. 57, No. 4, p. 141.
    Definition of Reserves: "...estimated remaining reserves of crude
    oil and condensate for the larger fields of the United States.
    Eligibility for this list requires a 1958 production rate of at
    least 1,000 barrels per calendar day. Figures refer to primary
    recovery only."
    Criteria for Inclusion: same as 1956

Year 1960.  Vol. 59, No. 5, p. 126.
    Definition of Reserves: The last sentence above is amended to
    read: "Figures...in the majority of cases refer to primary
    recovery only."
    Criteria for Inclusion: same as 1956

Year 1962.  Vol. 61, No. 4, p. 172.
    Definition of Reserves: same as 1960
    Criteria for Inclusion: same as 1956

†All Vol. references are to the Oil and Gas Journal.
2.3  Plotting the Data and Curve Fitting

The  total  number  N  of  fields  in,  say,  South  Louisiana  discovered  to 
date  constitutes  a  sample  of  fields  from  the  true  underlying  size  distribution 
of fields in South Louisiana, and the n < N fields reported in the
Oil and Gas Journal constitute a very particular sub-sample from this sample.

Provided  that  we  know  N,  we  may  use  this  sub-sample  to  test  the 
Lognormality  of  the  size  distribution  of  oil  fields  in  South  Louisiana  as 
follows: 

1.  List  the  n  sub-sample  observations  in  order  of  size,  from 
smallest  to  largest. 

2.  Consider the kth observation in this ordering (the kth
smallest of the n reported fields) as an estimate
of the (N-n+k)/(N+1)st fractile of the true underlying
distribution of field sizes.†

3.  Plot  the  fractile  estimates  on  Lognormal  probability  paper. 

4.  Fit  a  straight  line  to  the  data. 

If  the  sample  observations  are  from  a  Lognormal  population,  then  the 
plotted  points  should  lie  approximately  on  a  straight  line.   The  larger 
the  number  of  sample  observations  and  the  closer  the  fit  to  a  straight 
line,  the  more  reasonable  the  assumption  of  Lognormality  becomes. 
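
The fractile assignment in steps 1 and 2 can be sketched in Python as follows. This is an illustrative sketch only: the function name `fractile_estimates`, its arguments, and the sample sizes in the usage lines are assumptions introduced here, not values from the study. In place of probability paper, it converts each fractile to standardized Normal units, which is the alternative described in Appendix A.

```python
from statistics import NormalDist


def fractile_estimates(sub_sample, total_fields):
    """Pair each reported field size with its fractile and Normal score.

    sub_sample   -- the n reported field sizes (assumed to be the n largest)
    total_fields -- N, the number of fields discovered to date (N > n)
    """
    n = len(sub_sample)
    ordered = sorted(sub_sample)          # smallest to largest
    std_normal = NormalDist()             # mean 0, standard deviation 1
    rows = []
    for k, size in enumerate(ordered, start=1):
        fractile = (total_fields - n + k) / (total_fields + 1)
        rows.append((size, fractile, std_normal.inv_cdf(fractile)))
    return rows


# Hypothetical usage: three reported sizes out of N = 9 discovered fields.
for size, fractile, u in fractile_estimates([120.0, 45.0, 300.0], 9):
    print(size, round(fractile, 3), round(u, 3))
```

Because the sub-sample is taken to be the n largest of N fields, the smallest reported field is assigned rank N-n+1 rather than rank 1, which is what shifts all of the plotted points into the right tail.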

In order to determine N, we first counted the number of fields in
Oklahoma and in South Louisiana as listed in the International Oil and Gas
Development Yearbook 1962.†† In some instances, the definition of a field
in this reference differed from that given in the Oil and Gas Journal. The
fields listed in the former reference were aggregated when necessary so
as to make the field definition correspond with that being used in the Oil
and Gas Journal and then N was calculated. This aggregation had a significant
effect on N only in Oklahoma, where many of the fields listed in the Oil
and Gas Journal are listed as two or more geographic subdivisions; e.g.
Davenport in the Oil and Gas Journal is listed in the Yearbook as Davenport,
Davenport North, Davenport Northeast, Davenport South, Davenport Southeast,
and Davenport West.

†Any value which is both (a) equal to or greater than a fraction .f of
the values in the set and (b) equal to or less than a fraction (1-.f) of the
values in the set is a .f fractile of the set.

††Part II, Vol. XXXII, p. 137ff.

TABLE II

Number of Fields Discovered

                               As Listed
                              in Yearbook     Aggregated

North Louisiana                   328             271
South Louisiana--offshore         131             125
               --onshore          641             508
               --combined         772             633
Oklahoma                         2704            1361

Here, however, another difficulty arises, for N is well over 100 for
both Oklahoma and South Louisiana. The largest sample observation in
Oklahoma corresponds to a .9996 fractile--and Lognormal probability paper
presently in use allows one to plot fractile numbers of the order of .99 or
less. Since we are particularly interested in the behavior of the right tail
of the distribution, ordinary Lognormal probability paper is unsuitable.

One way of overcoming this difficulty is to plot sample values against
standardized Normal units as shown in Exhibits 1 through 9. The procedure
used here is justified mathematically and described in detail in Appendix A.
As with Lognormal probability paper, if the sample observations are from
a Lognormal population then the plotted points in the upper right tail of the
plot should lie approximately on a straight line.

Once the data is plotted, a straight line can be fitted by eye,† and
the resulting line used to give graphical estimates $\hat{\mu}$ and $\hat{\sigma}$ of the
parameters $\mu$ and $\sigma$ of a Lognormal distribution as described in Appendix A.
Alternately, one may estimate $\mu$ and $\sigma$ using actual data points. Both
ways of estimating the parameters of Lognormal distributions fitted to
the plots of Exhibits 1 through 9 were used and the results are displayed
in Table III.
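
As a rough numerical counterpart to fitting the line by eye, the sketch below fits the line by ordinary least squares. The text does not say how the calculated estimates were obtained, so least squares is an assumption made here for illustration, as are the function name `fit_lognormal_tail` and its arguments. The slope of the fitted line estimates $\sigma$ and the intercept estimates $\mu$.

```python
import math
from statistics import NormalDist


def fit_lognormal_tail(sizes, total_fields):
    """Least-squares line through (u_k, log x_k); returns (mu_hat, sigma_hat).

    sizes        -- the n largest field sizes observed (any order)
    total_fields -- N, the number of fields discovered to date
    """
    n = len(sizes)
    ordered = sorted(sizes)
    std_normal = NormalDist()
    # Normal score of the (N-n+k)/(N+1) fractile for k = 1..n.
    u = [std_normal.inv_cdf((total_fields - n + k) / (total_fields + 1))
         for k in range(1, n + 1)]
    y = [math.log(x) for x in ordered]
    u_bar = sum(u) / n
    y_bar = sum(y) / n
    s_uu = sum((ui - u_bar) ** 2 for ui in u)
    s_uy = sum((ui - u_bar) * (yi - y_bar) for ui, yi in zip(u, y))
    sigma_hat = s_uy / s_uu               # slope of the line
    mu_hat = y_bar - sigma_hat * u_bar    # intercept at u = 0
    return mu_hat, sigma_hat
```

Dropping the smaller reported fields before calling the function leaves the fractiles of the remaining fields unchanged, since the kth of the m largest fields still has rank N-m+k. This is one way to restrict the fit to the right tail.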

A  more  informative  display  of  properties  of  these  fitted  distributions-- 
means,  medians,  and  modes--is  given  in  Exhibit  9. 

2.4  Discussion  of  Exhibits  1  through  9 

Some  important  attributes  of  these  Exhibits  are: 

1.  All of the plots, instead of being linear, have a
pronounced downward bend in the left tail of plotted
points.

2.  By 1963, a straight line fits the right tail of both
the Oklahoma and South Louisiana histograms reasonably
well.

3.  The estimated mean of reported field sizes for South
Louisiana steadily increases from 1952 to 1963, while
that of Oklahoma slightly increases from 1952 to 1960
and then decreases. By 1963, the estimated mean
reported field size in South Louisiana is roughly
twice that of Oklahoma.

4.  Estimates of median and modal field sizes are
extremely small by comparison with estimated means
so that the distributions are all highly skewed to
the right.

†For a discussion of errors introduced by fitting a line by eye see [4].

Exhibit 1 - FRACTILES OF ULTIMATE PRIMARY ... OFFSHORE AND ONSHORE - AS OF ...

Exhibit 2 - FRACTILES OF ULTIMATE PRIMARY RESERVES IN SOUTH LOUISIANA -
OFFSHORE AND ONSHORE - AS OF 31 DECEMBER 1956 (N=633, n=124)

Exhibit 3 - FRACTILES OF ULTIMATE PRIMARY RESERVES IN SOUTH LOUISIANA -
OFFSHORE AND ONSHORE - AS OF 31 DECEMBER 1960 (N=633, n=152)

Exhibit 4 - FRACTILES OF ULTIMATE PRIMARY RESERVES IN SOUTH LOUISIANA -
OFFSHORE AND ONSHORE - AS OF 31 DECEMBER 1963 (N=633, n=171)

Exhibit 5 - FRACTILES OF ULTIMATE PRIMARY RESERVES IN OKLAHOMA - AS OF
31 DECEMBER 1952 (N=1361, n=75)

Exhibit 6 - FRACTILES OF ULTIMATE PRIMARY RESERVES IN OKLAHOMA - AS OF
31 DECEMBER 1956 (N=1361, n=85)

Exhibit 7 - FRACTILES OF ULTIMATE PRIMARY RESERVES IN OKLAHOMA - AS OF
31 DECEMBER 1960 (N=1361, n=75)

Exhibit 8 - FRACTILES OF ULTIMATE PRIMARY RESERVES IN OKLAHOMA - AS OF
31 DECEMBER 1963 (N=1361, n=86)

The downward bend in the left tail of all points is probably due in
great part to the peculiar criteria for inclusion of fields in the Oil and
Gas Journal listing--1000 or more barrels of production per calendar day.
This undoubtedly results in many "older" fields with ultimate primary
reserves of 1 to 20 million barrels being omitted from the listing, and
the resultant "thinness" of listed observations in this range causes the
downward bend. (A careful cataloguing of smaller fields would determine
whether or not this conjecture is true.)

If we focus on the right tail of the plot, however, and discount the
smaller values by fitting a straight line to those fields of a size greater
than 40 million barrels in South Louisiana and in Oklahoma, then the fit
looks reasonable. As can be seen, this was in fact what was done in the
visual fitting of these lines. The justification for fitting only plotted
points far out in the right tail with a straight line and then using the
line to make inferences about the parameters of the whole distribution is that

IF THE TRUE UNDERLYING DISTRIBUTION IS LOGNORMAL, AS THE
NUMBER OF "LARGE" SAMPLE OBSERVATIONS INCREASES, A
STRAIGHT LINE FITTED TO PLOTS OF SAMPLE POINTS IN THE
RIGHT TAIL OF HISTOGRAMS LIKE THOSE DISPLAYED IN
EXHIBITS 1 THROUGH 9 WILL ASYMPTOTICALLY LEAD TO ESTIMATES
OF $\mu$ AND $\sigma$ CLOSE TO THEIR TRUE VALUES WITH HIGH
PROBABILITY.

Notice that both graphical and calculated estimates of parameters and of
means, medians and modes are extremely close (Exhibit 9). The regularity
of behavior of estimates of means seems to tag them as better indicators of
the change of the histograms with time.

3.   The  Probability  Distribution  of  Field  Sizes  Remaining  to  be  Discovered 

In  order  to  calculate  the  probability  distribution  of  field  sizes  remaining 
to  be  discovered  in  an  area  at  a  given  point  in  time  we  must  make  some 
assumptions  about  the  true  size  distribution  of  fields  in  the  area,  total 
ultimate  recovery  from  the  area,  and  the  manner  in  which  the  histogram  of 
reported  field  sizes  and  the  true  size  distribution  of  fields  are  interrelated. 

3.1   Definitions  and  Assumptions 

For a given area and time periods t = 0, 1, 2, ..., define

$r_t$  -  reported size in barrels of a generic field discovered by
time t,

$s$  -  true size in barrels of a generic field,

$z$  -  size of a generic field undiscovered by t (unadjusted
for reporting bias),

$R_t$  -  total ultimate primary reserves recoverable from
fields discovered by time t,

$S$  -  total ultimate primary reserves recoverable from
all fields in the area.

We make three assumptions:

I  -  The random variable $\tilde{s}$ is Lognormally distributed with
parameter $(\mu_s, \sigma_s^2)$:

$$\tilde{s} \sim f_L(s \mid \mu_s, \sigma_s^2) .$$

II  -  The random variable $\tilde{r}_t$ is Lognormally distributed with
parameter $(\mu_t, \sigma_t^2)$:

$$\tilde{r}_t \sim f_L(r_t \mid \mu_t, \sigma_t^2) .$$
3.2  Derivation

We may informally regard the process of discovering fields as
hypergeometric-like sampling from a Lognormal population in which "area"
under the probability distribution function of s is "used up" as sampling
progresses; i.e. as more and more fields are discovered. In [1], it is
shown that as the number of discovered fields increases, the frequency
histogram of these fields asymptotically approaches a Lognormal functional
form if I above is true.

More precisely, I and II imply that given $R_t$, $S$, $(\mu_t, \sigma_t^2)$, and
$(\mu_s, \sigma_s^2)$, the probability distribution of field sizes remaining to be
discovered $\tilde{z}$ is

$$K\left[ f_L(z \mid \mu_s, \sigma_s^2) - \frac{R_t}{S}\, f_L(z \mid \mu_t, \sigma_t^2) \right] , \qquad (1)$$

$$z > 0, \quad \sigma_t > 0, \quad \sigma_s > 0, \quad -\infty < \mu_t < +\infty, \quad -\infty < \mu_s < +\infty ,$$

where K is a normalizing constant.
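
A minimal numerical sketch of the density (1) is given below. The function names and the argument `discovered_fraction` (standing for the ratio of R to S) are assumptions introduced here. The sketch also assumes that K is chosen so that (1) integrates to one; since each Lognormal density integrates to one, that choice gives K = 1/(1 - R/S). The parameter values actually used in the study come from Table III and are not reproduced here.

```python
import math


def lognormal_pdf(x, mu, sigma2):
    """Lognormal density with parameter (mu, sigma2); zero for x <= 0."""
    if x <= 0:
        return 0.0
    sigma = math.sqrt(sigma2)
    exponent = -((math.log(x) - mu) ** 2) / (2.0 * sigma2)
    return math.exp(exponent) / (x * sigma * math.sqrt(2.0 * math.pi))


def remaining_density(z, mu_s, sigma2_s, mu_t, sigma2_t, discovered_fraction):
    """Density (1) of field sizes remaining to be discovered.

    discovered_fraction -- R/S, the share of total reserves already found
                           (assumed to lie in [0, 1)).
    """
    # Assumption: K normalizes (1) to integrate to one.
    k_norm = 1.0 / (1.0 - discovered_fraction)
    true_part = lognormal_pdf(z, mu_s, sigma2_s)
    found_part = discovered_fraction * lognormal_pdf(z, mu_t, sigma2_t)
    return k_norm * (true_part - found_part)
```

When `discovered_fraction` is zero the function reduces to the density of s itself, which corresponds to t = 0, before any fields have been found.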

The idea of characterizing the probability density function of z as
in (1) may be shown graphically this way:†

[Figure: sketches of the densities at t = 0, at t = t' > 0, and at
t = t" > t', with the area remaining to be discovered shaded.]

†This view of the process is simply a probabilistic adaptation of a
deterministic model proposed by J.J. Arps and T.G. Roberts, [3].

The  probability  density  of  field  sizes  remaining  to  be  discovered  is  roughly 
proportional  at  each  point  in  time  to  the  height  of  the  shaded  area.   While 
we  could  in  fact  calculate  this  density  function  more  accurately,  a  great 
deal  of  analytical  simplicity  is  achieved  by  first  approximating  the  shape 
of  the  sample  histogram  with  a  Lognormal  distribution  and  then  using  (1),  as 
we  shall  see.   One  can  use  the  methods  described  in  section  2  to  do  this. 


And $R_t$ may be directly calculated.

Here we assume that the parameters $\mu_s$ and $\sigma_s^2$ have been derived using
the subjective judgements of geologists in the manner described in Chapter 6
of [1], and that S has been estimated by a gross volumetric analysis of the
area.
Using (1), we may directly calculate moments of the distribution of z.
Its mean and variance are

$$E(\tilde{z}) = K e^{\mu_s + \frac{1}{2}\sigma_s^2} - K \frac{R_t}{S}\, e^{\mu_t + \frac{1}{2}\sigma_t^2} ,$$

$$V(\tilde{z}) = E(\tilde{z}^2) - E^2(\tilde{z}) = K e^{2\mu_s + 2\sigma_s^2} - K \frac{R_t}{S}\, e^{2\mu_t + 2\sigma_t^2} - E^2(\tilde{z}) ,$$

and the kth partial moment of z about the origin is as shown in formula (2)
of Appendix B.
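
The mean and variance can be evaluated directly from the moment formula for a Lognormal variable, whose kth moment about the origin is exp(k mu + k^2 sigma^2 / 2). The sketch below is illustrative: the function name, the argument `discovered_fraction` (the ratio of R to S), and the choice K = 1/(1 - R/S) are assumptions made here rather than details taken from the text.

```python
import math


def remaining_mean_variance(mu_s, sigma2_s, mu_t, sigma2_t, discovered_fraction):
    """Mean and variance of field sizes remaining to be discovered."""

    def raw_moment(k, mu, sigma2):
        # kth moment about the origin of a Lognormal(mu, sigma2) variable.
        return math.exp(k * mu + 0.5 * k * k * sigma2)

    # Assumption: K normalizes the density (1) to integrate to one.
    k_norm = 1.0 / (1.0 - discovered_fraction)
    first = k_norm * (raw_moment(1, mu_s, sigma2_s)
                      - discovered_fraction * raw_moment(1, mu_t, sigma2_t))
    second = k_norm * (raw_moment(2, mu_s, sigma2_s)
                       - discovered_fraction * raw_moment(2, mu_t, sigma2_t))
    return first, second - first ** 2


# With nothing yet discovered the result is the mean and variance of s itself.
mean_s, var_s = remaining_mean_variance(1.5, 3.0, 0.0, 1.0, 0.0)
print(round(mean_s, 2), round(var_s, 1))
```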

3.3  An Hypothetical Example

To  illustrate  how  these  results  may  be  used,  suppose  we  wish  to  determine 
the  probability  distribution  of  z  for  South  Louisiana  as  of  1963. 

A geological and geophysical evaluation has led us to specify that as of
1963,

$$\mu_s = 1.5 , \qquad \sigma_s^2 = 3.0 , \qquad S = 25000 \times 10^6 .$$

Letting 1900 be t=0, taking estimates of $\mu_t$ and $\sigma_t^2$ from Table III, and
calculating $R_{63}$, we may use (1) to calculate the probability density function
$f_{63}(\cdot)$ of z as of 1963. Aside from the normalizing constant $K_{63}$, some right
tail values of $f_{63}(\cdot)$ for 1963 are shown in column (6) of Table V.

In a similar fashion we calculated some right tail values of $K_{52}^{-1} f_{52}(\cdot)$
for comparative purposes. These are shown in column (5) of Table IV.

Values such as those displayed in Table IV were used to plot the curves of
Exhibits 10 and 11. Note that these exhibits give a typical example of how
the right tail of the distribution of s is "used up" as the 86 fields used
to calculate fractile estimates as of 1952 (Exhibit 1) increase to 171 fields
in 1963 (Exhibit 4).

Using  the  formulae  of  Appendix  B,  we  may  calculate  the  moments  shown  in 
Table  VI. 
TABLE V

P(z > z)

    z          1952        1963

    4          .525        .493
    5          .475        .461
    6          .435        .398
    7.5        .388        .355
   10          .327        .294
   20          .198        .166
   30          .142        .116
   40          .110        .089
   50          .087        .067
   75          .052        .036
  100          .040        .030
  200          .015        .011
  300          .009        .006
  400          .0055       .004
  500          .0033       .002
 1000          .0010       .0006

TABLE VI

                                         1952        1963

Mean of z                               21.04       16.84
Variance of z                          8486.9      4984.6
Partial expectation
  of z > 40 x 10^6 bbls.                 15.2        10.8
Conditional Expected
  Field Size given z > 40               138.2       121.3
It follows that as of 1963, under the assumptions made earlier in this
example about s and $r_t$:

(1)  The probability of a newly discovered field being
larger than 40 million barrels is .089.

(2)  Conditional upon discovering a new field of 40
million barrels or larger, its expected (reported)
size is 121.3 million barrels.
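
The conditional expected size is the partial expectation above 40 million barrels divided by the probability of exceeding 40 million barrels. The short check below uses the tabulated values for 1952 and 1963; the function name is an assumption introduced here for illustration.

```python
def conditional_expected_size(partial_expectation, tail_probability):
    """Expected field size given that the size exceeds the threshold."""
    return partial_expectation / tail_probability


# Tabulated values, in millions of barrels, for the 40 million barrel threshold.
size_1952 = conditional_expected_size(15.2, 0.110)
size_1963 = conditional_expected_size(10.8, 0.089)
print(round(size_1952, 1), round(size_1963, 1))
```

Both quotients agree with the tabulated conditional expected field sizes to the one decimal place reported.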

4.   Summary 

We  have  shown  that  the  Lognormal  functional  form  is  a  reasonable  and  useful 
tool  to  use  in  the  course  of  analytically  characterizing  the  probability 
distribution  of  the  size  of  fields  remaining  to  be  discovered  in  a  basin.   In 
section  2  we  developed  a  method  for  fitting  Lognormal  functional  forms  to 
empirical  histograms  of  reported  field  sizes  even  when  the  samples  from 
which  these  histograms  are  constructed  are  truncated.  We  then  showed  how  to 
calculate  estimates  of  the  parameters  of  the  Lognormal  distributions 
according  to  which  the  samples  were  assumed  to  have  been  generated. 

Finally, in section 3 we illustrated by example how the results of the
analysis of sections 1 and 2 can be used to calculate a probability
distribution of field sizes remaining to be discovered in an area, as well as
to calculate the moments of this distribution.
APPENDIX A
Mathematical  Derivation  of  Plotting  Method  Used  In  Exhibits  1  through  9 

Let $\tilde{x}$ be a Lognormal random variable whose distribution function has parameter
$(\mu, \sigma^2)$ so that

$$\tilde{x} \sim f_L(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi}\,\sigma x}\, e^{-\frac{1}{2\sigma^2}(\log x - \mu)^2} , \qquad (1)$$

$$x > 0, \quad \mu > 0, \quad \sigma > 0 ,$$

and

$$P(\tilde{x} < x) = \int_0^x f_L(z \mid \mu, \sigma^2)\,dz \equiv F_L(x \mid \mu, \sigma^2) . \qquad (2)$$

Then defining

$$u = [\log x - \mu]/\sigma , \qquad (3)$$

we have

$$F_L(x \mid \mu, \sigma^2) = F_{N^*}(u) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{u} e^{-\frac{1}{2}v^2}\,dv . \qquad (4)$$

Define $u_i$ to be that value of u such that

$$F_{N^*}(u_i) = f_i \equiv \frac{N-n+i}{N+1} . \qquad (5)$$

Let $x_1, \ldots, x_n$ be a sequence of n independent sample observations, each generated
according to (1), ordered as to size ($x_i \geq x_j$, $n \geq i \geq j \geq 1$), and constituting the
n largest sample observations from a sample of size N > n.

If we regard $x_i$ as an estimate of the (N-n+i)/(N+1)st fractile of the distribution
of x, it follows from (3) and (5) that

$$\log x_i = \sigma u_i + \mu . \qquad (6)$$

Given $f_i$ we may compute $u_i$ from any table of the standardized Normal Cumulative
Distribution Function.

Hence knowing $x_i$ and $f_i$ determines a linear relation between $\log x_i$ and u
expressed in units of $\sigma$ up to the additive constant $\mu$; e.g.

[Graph: the straight line $\log x = \sigma u + \mu$, with $\log x$ on the vertical axis
and $\sigma u$ on the horizontal axis.]

Given an ordered sequence $x_1, \ldots, x_n$ of sample observations we may display
$x_i$, $f_i$, and $u_i$ for i=1,2,...,n as below:

    x_i        f_i               Units u_i of sigma

    x_1        (N-n+1)/(N+1)     u_1
    x_2        (N-n+2)/(N+1)     u_2
     .              .             .
    x_n        N/(N+1)           u_n

We may plot ordered pairs $(u_i, \log x_i)$ on a graph such as that shown above, and
fit a straight line to the plotted points. If in fact the $x_i$ are generated according
to (1) and n is "large" then the straight line should fit the data well, for the
$(u_i, \log x_i)$ must satisfy (6). The intercept of the line with the vertical axis
yields a graphical estimate of $\mu$. The slope of the line yields a graphical estimate
of $\sigma$. The graphical estimates displayed in Exhibit 9 were determined in this
fashion.


APPENDIX  B 

Derivation  of  Partial  Moments  of  the  Probability  Distribution  of 
Reserves  Remaining  to  be  Discovered 

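From Assumptions I and II we have

$$\tilde{r}_t \sim f_L(r_t \mid \mu_t, \sigma_t^2)$$

and

$$\tilde{s} \sim f_L(s \mid \mu_s, \sigma_s^2) .$$

Using I and II, we derived the probability distribution (1) of z, reserves remaining
to be discovered:

$$f_z(z) = K\left[ f_L(z \mid \mu_s, \sigma_s^2) - \frac{R_t}{S}\, f_L(z \mid \mu_t, \sigma_t^2) \right] . \qquad (1)$$

The kth partial moment about the origin of z is easily found using formula
(7.8b) of [1]:

$$E_0^a(\tilde{z}^k) \equiv \int_0^a y^k f_z(y)\,dy = K e^{k\mu_s + \frac{1}{2}k^2\sigma_s^2}\, F_{N^*}(w_s(a)) - K \frac{R_t}{S}\, e^{k\mu_t + \frac{1}{2}k^2\sigma_t^2}\, F_{N^*}(w_t(a)) , \qquad (2)$$

where

$$w_s(a) = [\log a - \mu_s]\,\sigma_s^{-1} - k\sigma_s$$

and

$$w_t(a) = [\log a - \mu_t]\,\sigma_t^{-1} - k\sigma_t ,$$

and where for notational simplicity we let $f_z(\cdot)$ represent the probability density
function (1).

Proof: Observe that

$$E_0^a(\tilde{z}^k) = K \int_0^a y^k \left[ f_L(y \mid \mu_s, \sigma_s^2) - \frac{R_t}{S}\, f_L(y \mid \mu_t, \sigma_t^2) \right] dy$$

$$= K \int_0^a y^k f_L(y \mid \mu_s, \sigma_s^2)\,dy - K \frac{R_t}{S} \int_0^a y^k f_L(y \mid \mu_t, \sigma_t^2)\,dy .$$

Then (2) follows directly from (7.8b) of [1].

It follows that the formula for the incomplete kth moment of z from a to $\infty$ is

$$E_a^\infty(\tilde{z}^k) = K e^{k\mu_s + \frac{1}{2}k^2\sigma_s^2}\, G_{N^*}(w_s(a)) - K \frac{R_t}{S}\, e^{k\mu_t + \frac{1}{2}k^2\sigma_t^2}\, G_{N^*}(w_t(a)) ,$$

where

$$G_{N^*}(w) = 1 - F_{N^*}(w) .$$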
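
The incomplete moment from a to infinity can be evaluated with the standard Normal distribution function, using the fact that for a Lognormal variable the kth moment above a equals exp(k mu + k^2 sigma^2 / 2) times the Normal right-tail probability at (log a - mu)/sigma - k sigma. The sketch below is illustrative only: the function name, the argument `discovered_fraction` (the ratio of R to S), and the normalization K = 1/(1 - R/S) are assumptions introduced here.

```python
import math
from statistics import NormalDist


def tail_moment(a, k, mu_s, sigma2_s, mu_t, sigma2_t, discovered_fraction):
    """Incomplete kth moment, from a to infinity, of remaining field sizes.

    With k = 0 this is the probability that a remaining field exceeds a;
    with k = 1 it is the partial expectation above a.
    """
    std_normal = NormalDist()

    def lognormal_tail(mu, sigma2):
        sigma = math.sqrt(sigma2)
        w = (math.log(a) - mu) / sigma - k * sigma
        right_tail = 1.0 - std_normal.cdf(w)
        return math.exp(k * mu + 0.5 * k * k * sigma2) * right_tail

    # Assumption: K normalizes the density of z to integrate to one.
    k_norm = 1.0 / (1.0 - discovered_fraction)
    return k_norm * (lognormal_tail(mu_s, sigma2_s)
                     - discovered_fraction * lognormal_tail(mu_t, sigma2_t))
```

Dividing the k = 1 result by the k = 0 result for the same a gives the conditional expected field size given that the field exceeds a.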